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Thermal conductivity of isolated single molecule DNA fragments is of importance for nanotechnology, but 
has not yet been measured experimentally. Theoretical estimates based on simplified (ID) models predict 
anomalously high thermal conductivity. To investigate thermal properties of single molecule DNA we have 
developed a 3D coarse-grained (CG) model that retains the realism of the full all-atom description, but is sig- 
nificantly more efficient. Within the proposed model each nucleotide is represented by 6 particles or grains; the 
grains interact via effective potentials inferred from classical molecular dynamics (MD) trajectories based on a 
well-established all-atom potential function. Comparisons of 10 ns long MD trajectories between the CG and 
the corresponding all-atom model show similar root-mean-square deviations from the canonical B-form DNA, 
and similar structural fluctuations. At the same time, the CG model is 10 to 100 times faster depending on the 
length of the DNA fragment in the simulation. Analysis of dispersion curves derived from the CG model yields 
longitudinal sound velocity and torsional stiffness in close agreement with existing experiments. The computa- 
tional efficiency of the CG model makes it possible to calculate thermal conductivity of a single DNA molecule 
not yet available experimentally. For a uniform (polyG-polyC) DNA, the estimated conductivity coefficient is 
0.3 W/mK which is half the value of thermal conductivity for water. This result is in stark contrast with esti- 
mates of thermal conductivity for simplified, effectively ID chains ("beads on a spring") that predict anomalous 
(infinite) thermal conductivity. Thus, full 3D character of DNA double-helix retained in the proposed model 
appears to be essential for describing its thermal properties at a single molecule level. 



I. INTRODUCTION 

Heat conductivity of nanostructures is of great importance 
both from fundamental and applied points of view. For ex- 
ample, superior thermal conductivity has been observed in 
graphene [1, 2] and carbon nanotubes [3], which has raised an 
exciting prospect of using these materials in thermal devices 
[4—8]. Generally, one can not expect that bulk thermal prop- 
erties of a material will remain unchanged at the nanoscale: 
in some nano materials such as silicon thermal conductivity is 
about two orders of magnitude smaller than that of bulk crys- 
tals [9], with the reduction in conductivity attributed to strong 
inelastic surface scattering. Furthermore, some familiar phys- 
ical laws such as Fourier's law of heat transfer that work in 
bulk materials are no longer valid on the nanoscale [10-13]. 

Deoxyribonucleic acid (DNA) is one of the most promis- 
ing nanowire materials due to the relative ease of modifica- 
tions combined with the self-assembly capability which make 
it possible to construct a great variety of DNA-based nanos- 
tructures [14, 15]. While electrical conductivity of single 
DNA molecules has been extensively studied, the correspond- 
ing thermal properties remain largely unexplored. The first, 
and to the best of our knowledge the only published work so 
far that attempted to measure thermal conductivity of single 
molecule DNA - DNA-gold composite [16] - gave an esti- 
mate of 150 W/mK for the coefficient of thermal conductivity, 
which was conspicuously close to that of pure gold. The study 
concluded that molecular vibrations play a key role in thermal 
conduction process in DNA molecule, but thermal conductiv- 
ity of single molecule DNA remained unknown. 

At the same time, theoretical approaches to the problem 
have met with their own difficulties. Numerical modeling of 
heat transfer along carbon nanotubes and nanoribbons showed 
that thermal conductivity increases steadily with the length 
of the specimen [10-13]. If one makes an analogy with ID 



anharmonic chains that always have infinite thermal conduc- 
tivity [17, 18], one might interpret these results as suggest- 
ing anomalously high thermal conductivity for quasi one- 
dimensional nanosystems. Since at some level the DNA 
double helix may also be considered as a quasi ID sys- 
tem, one wonders if the corresponding thermal conductivity 
is also anomalously high, increasing with the length of the 
DNA molecule? It is possible that over-simplified "beads-on- 
spring" models of DNA are inappropriate in this context, and 
thermal properties of the real double helix do not exhibit the 
low dimensional anomaly in heat conductivity. 

The goal of this work is to investigate heat conductivity 
of single molecule DNA by direct modeling of heat transfer 
along the double helix via classical molecular dynamics of 
the DNA. To accomplish this goal we will have to choose a 
level of detail that is computationally feasible but at the same 
time retains key properties of the fully atomistic picture of the 
molecule. 

Classical molecular dynamics (MD) simulations based on 
fully atomistic (all-atom) representations [19-21](see Fig. 1) 
are among the most widely used tools currently employed 
to study dynamics of the DNA double helix [22]. In these 
simulations the dynamics of the atoms is governed by semi- 
empirical potentials, or force-fields; CHARMM27 [20, 21] 
or AMBER [23] are the most common force-fields that accu- 
rately reproduce a variety of structural and dynamical proper- 
ties of small fragments of canonical and non-canonical nucleic 
acids in water, at least on time-scales of up to one microsec- 
ond [22, 24-32]. Importantly, classical force-fields such as 
AMBER [33] can reproduce high-level quantum mechanical 
calculations for hydrogen bonding and base stacking interac- 
tions [34, 35]. However, accuracy of these all-atom models 
in which every atom of the DNA fragment and all of the sur- 
rounding solvent molecules are represented explicitly comes 
at a price of substantial computational expense that limits the 



range of applicability of the models. 

The so-called implicit solvent approach [36^-0] reduces the 
computational expense by replacing the discrete water envi- 
ronment with a continuum with dielectric and "hydrophobic" 
properties of water. The solvent degrees of freedom are "in- 
tegrated out" and the corresponding free energy term is added 
to the Hamiltonian of the system. However, even in this case 
all-atom simulations may be computationally expensive. For 
example, a single 5 ns long simulation of a 147 base pair DNA 
fragment reported in Ref. [41] took 1 15 hours on 128 proces- 
sors. This example suggests that all-atom models may not be 
suitable for the program set out in this work, in which heat 
transfer along long fragments of DNA will have to be exam- 
ined. We therefore resort to yet another level of approxima- 
tion - coarse-graining (CG), where sets of original atoms are 
grouped into single "united atoms" particles or grains. 

The remainder of this work is organized as follows. We be- 
gin with an outline of the coarse-graining procedure leading 
to the proposed model, followed by a description of the po- 
tential function. Details are provided in the Appendix. We 
validate the model by comparing its dynamics with that of the 
corresponding all-atom model. Small amplitude vibrations 
and dispersion curves are analyzed next, leading to an addi- 
tion verification of the model by comparison of several pre- 
dicted characteristics (speed of sound, torsional rigidity) with 
the experiment. Then, we describe in detail the formalism 
used to model the heat transfer along a single DNA molecule. 
In "Conclusion" we provide a summary of the results and a 
brief discussion. 



II. THE COARSE-GRAINED MODEL OF DOUBLE 
HELICAL DNA 

Naturally, there is no unique prescription for subdividing a 
macromolecule into grains. The grouping of individual atoms 
into grains aims to achieve a balance between faithful repre- 
sentation of the underlying dynamics and the associated com- 
putational expense which is directly related to the number of 
grains retained in the CG description. A fairly large number of 
coarse-grain DNA models has been developed [42-61]. Many 
of these models are phenomenological - each nucleotide is 
represented by 1 to 3 grains interacting via relatively simple 
pair potentials designed to reproduce either certain set of ex- 
perimental properties or the results of numerical simulations 
based on the corresponding all-atom models. However, the 
oversimplified description of the nitrogen bases carries the 
risk of losing some key details of the base-base interactions, 
particularly their stacking part, that affects intramolecular re- 
arrangements. The latter plays a very important role in heat 
transfer along the DNA molecule [62]. To make sure the ni- 
trogen bases are treated as accurately as possible within the 
CG description, we follow a strategy in which each base is 
modeled by three grains; the interaction between the bases is 
modeled at the all-atom level via a computationally effective 
strategy described below. 

Within the coarse-grain model each nucleotide is repre- 
sented by 6 coarse-grained particles, or grains: 1 for the phos- 




FIG. 1: View of a DNA fragment (CGTTTAAAGC) for (a) stan- 
dard all-atom representation of the double helix and (b) the pro- 
posed coarse-grained model (12CG) based on 12 united atom par- 
ticles (grains) per base pair. 



phate group, 2 for the sugar ring, and 3 for the nitrogen base. 
The mass of each coarse grain equals the net mass of the orig- 
inal atoms that make up that grain; for the 3 base grains the 
original mass is distributed between them as described in the 
Appendix. The fine-level to coarse-grain reduction employed 
by our model is shown in Fig. 2. Following Bruant et al. [42], 
where all-atom molecular simulations were used to identify a 
set of relatively rigid groups of atoms in the DNA, all of the 
original atoms of the phosphate and C5' groups [atoms P, OIP, 
02P, 03', 05', C5', H5'l, H5'2, see Fig. 2] are combined into 
a single [P] grain which is placed at the position of the original 
P atom. 

The sugar groups are described by two grains which are 
placed on the original C3' and CI' atoms; they will be denoted 
as [C3] and [CI]. The grain [C3] includes C3', H3', C4' and 
H4' original atoms, the grain [CI] includes original CI', HI', 
C2', H2'l, H2'2 and 04' atoms. Thus, within our coarse- 
grain model the backbone of the double helix is represented 
by a chain of 3 particles (grains) [P], [C3] and [CI] (see Fig. 
2). 

Nitrogen bases (A, T, G and C) are rather rigid, planar struc- 
tures; spatial position and orientation of each base can be 
uniquely determined from positions of any three atoms that 
belong to that base. Therefore, bases A, T, G, C will be de- 
scribed in terms of three grains. For the A base, we identify 
the three grains with the original C8, N6, C2 atoms; for the T 
base, the three atoms are C7, 04, 02; for the G they are C8, 
06, N2 atoms; and for the C base, they are C6, N4, and 02 
original atoms. Thus within the suggested model one base- 
pair (bp) of the DNA double helix consists of 12 grains - we 
call the model "12CG" [see Fig. 1 (b)]. For N base-pair 
double helix, our system will consist of 12A particles. Note 
that within our terminology the simplest possible "beads-on- 
spring" model would be called "ICG" (one grain per base 
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FIG. 2: Combining original atoms into coarse grains on the DNA 
backbone. Dashed lines indicate atoms that are included in the cor- 
responding grain, solid circles mark the atoms on which the grain is 
centered. 

pair), and the all-atom representation would be "40CG" , al- 
though in this case the exact number would depend slightly on 
the base sequence e. 

Interactions between neighboring base pairs are obviously 
very important for heat transfer along the DNA molecule. So 
within the framework of our coarse-grained model the stack- 
ing of neighboring base pairs should be taken into account 
as accurately as possible. We take advantage of the planar 
structure of the bases to bring the accuracy of the stacking in- 
teractions close to the all-atom level, but with little additional 
computational expense: from the known grain coordinates of 
each coarse-grain base, one can trivially restore coordinates 
of all of the original atoms in the base with virtually no addi- 
tional computational expense. We then uses these coordinates 
to calculate the stacking energy using accurate all-atom po- 
tentials, see Appendix for details. 

m. THE POTENTIAL FUNCTION 

To describe interactions between the grains, we employ a 
potential function that contains all of the "standard" terms 
used in classical molecular dynamics simulations [63, 64]. 
These terms include internal energy contributions such as 
bond stretching and angle bending, short-range van der Waals 
(vdW) interactions, and long-range electrostatic interactions 
in the presence of water and ions. The latter are modeled im- 
plicitly, at the continuum dielectric, linear response level. The 
detailed term by term description of the potential is given in 
the Appendix. 

The total energy of the system consists of nine terms: 

H = Ek+E v +Ei, + E a +Et + Ehb+E st +E e i+E v( iw- (1) 



The first term Ek stands for kinetic energy of the system, 
the terms E v , E a , E t describe respectively bond, angle and 
torsion deformation energy of the backbone. The term Eb 
stands for base deformation energy and was introduced to hold 
four points - CI' and three points on a nitrogen base - near 
one plane. Last two terms E e i , E v dw describe electrostatic 
and van der Waals interactions between grains on the back- 
bone. Interaction between nitrogen bases, including interac- 
tions along the same chain (stacking) as well as interactions 
across the complementary chains (including hydrogen bonds 
between complementary bases), are described by two terms 
E st and Em,. These two potentials depend on coordinates 
of all of the original atoms of the base. These coordinates 
are uniquely calculated from positions of the three grains that 
form each base; the reader is referred to Appendix for details. 
A fortran implementation of the model is freely available at 
http://people.cs.vt.edu/onufriev/software 

IV. VALIDATION OF THE MODEL 

We begin validating the proposed coarse-grain model by 
comparing the resulting DNA dynamics with that produced 
by the corresponding well-established all-atom model. Later 
in this work we will also discuss direct comparisons with the 
experiment (estimated sound velocities). 

In what follows we use following notation for convenience: 
x n.j,j = 1, • • • j 12 are coordinates of 12 grains on the n- 
th base-pair of the double helix (see Fig. 3). Therefore, the 
configuration of n-th base-pair is given by a 36-dimensional 
coordinate vector u„ = {~X- n ,j}j^i- The constant tempera- 
ture dynamics of the double helix is obtained by integrating 
numerically the following system of Langevin's equations: 

M„ii n = -dH/du n -TM n ii n + E n , (2) 

where n = 1,2, N, T = l/t r is the Langevin collision 
frequency with t r = 1 ps being the corresponding particle re- 
laxation time, M„ is a diagonal matrix of grain masses of n-th 
base-pair, and 5„ = {£,n,k}k=i is a 36-dimensional vector of 
Gaussian distributed stochastic forces describing the interac- 
tion of n-th base-pair grains with the thermostat with correla- 
tion functions 

(M*i)W*2)) = 2MTk B TS nm S ij 5(t 2 -h), 

where the mass M = M k , if i = 3(fc - 1) + 1, k = 1, 12, 
I = 1,2,3. 

To bring the temperature of the molecule to the desired 
value T = 300K , we integrate the system (2) over time 
t = 20t r starting from the following initial conditions 

{u n (0)=u°, ii n (0)=O}£U 0) 

that correspond to the equilibrium state of the double helix 
{u°}„ =1 . Once the system is thermalized, the temperature is 
maintained at T = 300A' and the trajectory continues for 10 
ns. 

The first step in the validation procedure is to estimate root- 
mean-square deviation (RMSd) of the end point (t=10 ns) of 
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FIG. 3: Fragment of the DNA double helix in the coarse-grained 
representation. Base-pairs n and n + 1 are shown. 



the trajectory from a reference DNA structure, and compare 
the RMSd values between the CG and the reference all-atom 
trajectory (AMBER). Given two structures, the RMSd can be 
computed as: 



1 



12 N SeSO(3), 
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where Ti,i = 1, 12 AT is the reference (e.g., initial), and 
r[ is the final set of coordinates of the structure. The expres- 
sion is minimized over a translation (vector 1) and a rotation 
around a fixed point (operator S). The details of the algorithm 
are described in the Ref. [80]. Analysis of RMS deviations 
from reference structures as a function of simulation time is 
commonly used as initial check of stability of the system and 
quality of the underlying models [66, 68]. 

As is common in the field, the following sequence of 12 
base pairs d(CGCGAATTGCGC) 2 (Dickerson's dodecamer) 
was used for this test; experimental X-ray structure of this 
B-DNA fragment is available. A constant temperature (T = 
300iO simulation was performed for 10 ns. As one can see 
from the Fig. 4 the various RMSd metrics fluctuate around 
their equilibrium values, which suggests that the system re- 
mains stable in dynamics, on the time scale of the simulation. 
A comparison with the corresponding all-atom simulation is 
shown in Fig. 4 (b). This all-atom simulation uses the same 
12 base-pair fragment, and is based on the latest nucleic acid 
force-field (parmbscO [23]) from AMBER. The solvent was 
represented via the generalized Born implicit solvent approx- 
imation; all other parameters such as Langevin collision fre- 
quency, ambient salt concentration, etc. were the same as in 
the CG simulation shown in Fig. 4 (a). Comparing Figs. 4 (a) 
and (b) we can see that the all-atom RMSd is slightly larger 
than that of the 12CG models. We can conclude that the 12CG 
model is somewhat more rigid as compared with all-atom one. 
Finally, we note that the equilibrium RMS deviation from the 
experimental (X-ray) B-form DNA is about 2.5A , Fig. 4 (c), 




FIG. 4: Comparison of time dependence of RMS deviation relative to 
various reference structures in coarse-grained and all-atom molecular 
dynamics simulations of a 12 base-pair DNA fragment at T=300K. 
(a) 12CG model simulation. RMSd is relative to the first frame, (b) 
All-atom model simulation. RMSd is relative to the first frame, (c) 
12CG model simulation. RMSd is relative to B-DNA X-ray structure 
[81]. For all-atom structures the RMSd is computed only for the 
subset of atoms that define grain centers in the corresponding CG 
model. 



which is similar to what was observed earlier in all-atom im- 
plicit solvent simulations [68]. 

Another common set of structural parameters used in vali- 
dation of DNA models is helical parameters. These parame- 
ters determine the interaction between neighboring base pairs, 
hence they are significant for heat transfer processes. Let's 
choose, for simplicity, two of them which are the most rele- 
vant ones for describing the over-all structure of the double 
helix. The first of these parameters is the angle <fi, called twist, 
through which each successive base pair is rotated around the 
helical axis relative to its (nearest neighbor) predecessor. The 
second one, rise, is the distance between such two neigh- 
boring base pairs. Given the structure of a single nucleotide 
and the values of the twist and rise, one can re-construct the 
whole molecule assuming that it is a "one-dimensional" uni- 
form crystal. Exact algorithm of calculating these parame- 
ters is described in [82]. We used X3DNA [82] package and 
in-house software for computing these parameters in our all- 
atom and CG models. With regards to twist and rise, the vali- 
dation of our 12CG model was performed in the same manner 
as previously described in the context of an all-atom model 
[66]. The results are presented in Fig. 5, where the averages 
of the 10ns simulation trajectories and the standard deviations 
(indicated by error bars) for each base pair step are shown. 
One can see that the twist and rise values for 12CG model are 
rather close to those of the all-atom model. A small differ- 
ence is comparable with that seen between DNA simulations 
in explicit vs. implicit solvent [66]. 
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FIG. 5: Comparison of two common helical parameters (a) Az (rise) 
and (b) A(f> (twist) between the CG model (curves 1, 3) and the cor- 
responding all-atom model (curves 2, 4) (n - number of base pair 
step). Shown are averages over the corresponding 10ns molecular 
dynamics trajectories at T=300K. 



V. THE DISPERSION CURVES AND SMALL-AMPLITUDE 
OSCILLATIONS 

The proposed 1 2CG model enables one to compute dynam- 
ical evolution of a DNA molecule with any base sequence. 
However, for homogeneous molecules, that is if all base pairs 
are identical, the molecule can be considered as quasi-one- 
dimensional crystal with the elementary cell being one nu- 
cleotide pair of the double helix. This is a very useful sim- 
plification that will be employed here; it is also a very rea- 
sonable one as long as the focus is on the over-all physics of 
the structure, not on sequence dependent effects. The main 
advantage of the homogeneity assumption is that linear oscil- 
lations can be analyzed by standard techniques of solid state 
physics. To be specific, let's consider a poly-G double heli- 
cal chain, assumed to extend along the z-axis. In the ground 
state of the double helix, each successive nucleotide pair is ob- 
tained from its predecessor by translation along the z-axis by 
step Az and by rotation around the same axis through helical 
step Acf>. These are the rise and twist parameters introduced 
in the previous section. 

x n ,j,i = Zn-ij,i cos(A0) - x n -i,j,2 sin(A<£), 
%n,j,2 = %n-\,j,i sin(A0) - z„_i j, 2 cos(A^), (4) 

Xn,j,S — Xfi— X,j,3 ~r" Az 

Thus, the energy of the ground state is a function 
of 38 variables: { x i,j}jii> A0, Az, where xi^ = 
(xi x\ j. 2, £i,j,3) is the vector position of j-th grain of the 
first nucleotide pair. 



Finding the ground state amounts to the following mini- 
mization problem: 

E Q = E V + ... + E vdW -> min : {xij}}^, A0, Az, (5) 

where the sum extends over one nucleotide pair n = 1, and the 
relation (4) holds for calculation of the energies E v ,...,E v dw- 

Numerical solution of the problem (5) has shown that the 
ground state of poly-G DNA corresponds to the twist value of 
A</>o = 38.30°, and the rise value (z-step) of Azo = 3.339A. 
It should be noticed that if all of the long-range interaction 
were omitted, i.e., without two last terms E„ and E v $w m 
the Hamiltonian (Bl), the helical step values would change 
only slightly, by about 1 per cent: A^o = 38.03°, Azo = 
3.309A. Thus, Ion g-range electrostatic interactions between 
the charged group result in the relative elongation of the chain 
by only about 1 per cent. Parameters of the double helix com- 
puted within our model differ only slightly from the "canon- 
ical" parameters of the B -conformation of a (heterogeneous) 
DNA double helix in the crystal form [83], for which the aver- 
age twist angle is A</> = 34° 36°, and average rise per base 
pair is Az = 3.4A. 

To find the ground state of the homogeneous double he- 
lix under tension, it is necessary to minimize (5) under the 
fixed value of longitudinal step Az. As a result, one can 
obtain the dependence of the homogeneous state energy on 
the longitudinal step. This function E (Az) has a minimum 
when Az = Azo, which corresponds to the B-conformation 
of the double helix. Longitudinal stiffness of the helix K z = 
d 2 Eo/dAz 2 \Az - Specifically, within our model we esti- 
mate K z = 16 N/m. Since the energy Eq which is being 
derived is normalized to one nucleotide pair one can calcu- 
late the stretching modulus S = K z Azq = 16 N/m x3.4 
A= 5440pN. This estimate is somewhat higher than the cor- 
responding estimates of 1530 ■ ■ ■ 3760 pN obtained from fluc- 
tuations of distances between base pairs observed in MD 
simulations [42]. The relatively larger value of K z from our 
CG model is consistent with the model's over-all larger stiff- 
ness relative to the all-atom description, see a discussion 
above. Some of the difference between the two estimates may 
also be due to methodological differences in estimating lon- 
gitudinal stiffness. Values of the stretching modulus derived 
from experiments are of the order 1000 pN[84— 86], i.e., about 
5 times smaller than our estimate based on the CG model. One 
should keep in mind, however, that we have obtained only an 
upper estimate for the stretching modulus: temperature was 
assumed to be zero, the calculations were based on a homoge- 
neous poly-G-poly-C sequence that was reported to be more 
rigid than inhomogeneous and poly-A-poly-T sequences used 
in experiments [87, 88], and the entropy component was not 
considered in our calculations. 

To obtain Eo(A<f>), that is the dependence of the helix en- 
ergy on the helical step A<f>, we set Az = Azo in (5) and 
perform the minimization with respect to the remaining 36 
parameters. Then, torsion stiffness of the double helix = 
Az d 2 E /dA(t) 2 \ Arj)0 . Our estimate, K$ = 5.8 x 10~ 28 
J-m, is in good agreement with the experimental value of 
Ktj, = 4.1 ± 0.3 x 10~ 28 J-m, obtained for DNA macro- 
molecule in B-conformation [89]. 
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For analysis of small-amplitude oscillations of the dou- 
ble helix it is convenient to use local cylindrical coordinates 
v„,j = (v n ,j,i, v n ,j,2, v n ,j,3), given by the following expres- 
sions: 



V 



ii. j 



1 Bmtpnj + V n ,j,2 COS( 



"J j 



^n.j.a + v n,j,i cos <f> n j + v n j t2 sin <f> n j, (6) 



with x° (n = 0,±1,±2,...; j = 1,2,. ..,12) being coordi- 
nates of the grains in the ground state of the double helix, and 
4> n i being angular coordinate of the grain (n, j). Within these 
new coordinates the molecule's Hamiltonian (Bl) has the fol- 
lowing form: 



-(Mv„,v„ 



P(v„_i,v„,v„ + i) 



(7) 



where v„ = (u„.i, u„.2, Un,i2) is a 36-dimensional vec- 
tor, M is 36-dimensional diagonal mass matrix. Note that the 
last two terms E q and E v dw, responsible for long-range inter- 
action, have been omitted. This simplification is critical from 
the methodological point of view, but has very little impact on 
the accuracy of the estimates of DNA thermal conductivity. 
The point will be discussed below. 

Hamiltonian (7) corresponds to the following system of 
equations of motion: 



- Mv„ = Pi(v„,v„+i,v n+2 ) 

+P2(v„_l, V„, V„+l) + P3(v„-2, V„_l, V„), 



(8) 



where Pj(vi, V2, V3) = dP/dvi, i = 1,2, 3. Within the lin- 
ear approximation, the system (8) has the form 

-Mv n = BiV„+B2V„ + i+B2V„_i+B 3 V„ + 2+53V„_2, 

(9) 

where matrix elements are given by 



Bl — Pll + P-22 + -P337 B2 — P\2 + p23j 

and partial derivative matrix is given by 

d 2 P 



B, = P 



13; 



p. . 

dvidv. 



(0,0,0), i,j = 1,2,3. 



Solution of the system of linear equations (9) can be found 
in the standard form 



v.„ = Aeexp[i(qn — u>t)], 



(10) 



where A is linear mode amplitude, e is unit vector (|e| = 1), 
q G [0, 7r] is dimensionless wave number. Substituting the 
expression (10) into the system (9), we arrive at the following 
36-dimensional eigenvalue problem: 



2 Me = [B x + B 2 exp(iq) + B^ exp(-iq) 
+B3 exp(2iq) + exp(— 2iq)]e. 



(ID 



Thus, to obtain dispersion relations which characterize eigen- 
modes of the DNA double helix, one has to find all eigen- 
values of the problem (11) for each value of wave number 
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FIG. 6: 36 branches of the dispersion curve of homogeneous poly-G 
DNA: (a) high-frequency and (b) low-frequency branches. 



< q < 71". The calculated dispersion curve includes 36 
branches and is shown on the Fig. 6. 

It can be seen from Fig. 6 that frequency spectrum consists 
of low-frequency < oj < 175cm _1 and high-frequency 
10 € [267, 749]cm~ 1 domains. The high-frequency domain 
describes internal oscillations of the bases. As shown in Fig. 6 
(a), corresponding dispersion curves have very small slope, 
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meaning that the high-frequency oscillations have a small dis- 
persion. The low-frequency oscillations have larger disper- 
sion - see Fig. 6 (b). There are two acoustic dispersion curves 
which include zero point (q = 0, u = 0). The first curve 
u>i (q) describes torsional acoustic oscillations, the second one 
u>2(q) describes longitudinal acoustic oscillations of the dou- 
ble helix. Thus we can obtain the two sound velocities 

a v a ,• u 2(q) 

Vt = Az Urn , vi = l\z lim , 

q^>o q 9 ->o q 

with Az being z-step of a double helix. The value of the tor- 
sional sound velocity is v t = 850 m/s, and the value of the 
longitudinal sound velocity is vi = 1790 m/s. One of these 
dispersion curves includes the special point (q — A</>, cj = 0) 
(A(p is the angular helix step). This curve describes bending 
oscillations of the double helix which we do not analyze in de- 
tail because we have so far neglected the long-range interac- 
tions that are known to have strong effect on bending rigidity 
of the DNA. 

The estimated longitudinal sound velocity is in agreement 
with experimental value of the sound velocity in DNA fibers 
[90]: vi = 1900 m/s. Another experimental estimate [91] of 
the same quantity is higher, Vi = 2840 m/s, and was obtained 
from inelastic X-ray scattering. The same work reports tor- 
sional sound velocity v t = 600 m/s; the 20 % discrepancy 
with our estimate of v t = 850 m/s appears acceptable given 
similar margin of error seen between different experimental 
estimates for the longitudinal velocity. 

VL FREQUENCY SPECTRUM OF THE THERMAL 
OSCILLATIONS. 

Let's again consider a homogenous poly-G DNA chain con- 
sisting of N = 200 base pairs and calculate its frequency 
spectrum density. We begin by simulating dynamics of the he- 
lix without taking into account long-range interactions. Later, 
we will turn them on to analyze the effect of making this ap- 
proximation. 

To obtain thermalized state of the double helix, the system 
of Langevin's equations (2) should be numerically integrated. 
For thermalization of the double helix let's consider initial 
conditions corresponding to the ground state (3), and integrate 
the system (2) over time t = 20t r . After the equilibration pe- 
riod, the coupling with the thermostat is switched off, and the 
frequency density p(ui) of the kinetic energy distribution is 
obtained. To increase precision, distribution density was cal- 
culated as an average over all grains of the helix. 

The computed frequency spectrum density at T = 300K 
is shown in the Fig. 7. The spectrum is clearly divided into 
a low-frequency < cj < 175 cm -1 and a high-frequency 
267 < cj < 749 cm -1 domain, consistent with the dispersion 
curves of Fig. 6. 

Simulating the double helix dynamics with account for all 
interactions, including long-range ones, (results not shown) 
yields almost the same frequency spectrum. Only the density 
of oscillations in the interval < u> < 10 cm -1 increases 
somewhat. 




100 200 300 400 

co (cm -1 ) 

FIG. 7: Frequency spectrum density of the DNA double helix ther- 
mal fluctuations at T = 300K. 

VH. HEAT CONDUCTIVITY OF THE DOUBLE HELIX 

For numerical modeling of the heat transfer along the DNA 
double helix, we consider a chain of a fixed length with the 
ends placed in two separate thermostats each with its own tem- 
perature. To calculate the coefficient of thermal conductivity, 
we have to calculate numerically the heat flux through any 
cross section of the double helix. Therefore, first we need to 
obtain a formula for the longitudinal local heat flux. 

Let us consider the homogeneous double helix 
poly-G DNA. (The method below is also applicable to 
any sequences of bases). 

If long-range interactions (electrostatic and van der Waals) 
are not taken into account we can present the Hamiltonian of 
the helix (Bl) in the form 

H = ^ ^(^"'"n) + P ( u n-l,U n ,U n+1 ), (12) 

n 

where the first term describes the kinetic energy of atoms in a 
given cell and the second term describes the energy of inter- 
action between the atoms within the cell and with the atoms 
of neighboring cells. The corresponding equations of motion 
can be written in the form 

Mii n = -Pi(u„,u n+ i,u n+2 ) -P 2 (u n _i,u n ,u n+ i) 

-P 3 (U„_2,U„_1,U„), (13) 

where the function Pj is defined as 

d 

Pj = -P(ui,U2,u 3 ), j = 1,2,3. 

To determine the energy flux through the double helix cross 
section, we re- write formula (12) in a compact form, H = 
h n , where h n is the energy density, 

hn = i(Mu„,u„) + P(u„_i,U„,U„ + i). (14) 

Local longitudinal heat flux j n is defined through local en- 
ergy density h n by the discrete version of the continuity equa- 
tion, 

4: /in =3n —jn+i- (15) 
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Using the energy density (14) and the equations of motion 
(13), we can derive the following relations: 

^-h n = (Mu„, U„) + (Pl,n, Un-l) + (P2,n, U„) 

at 

+ (^3,n, Un+l) = — (-Pl.n+l) U„) — (P3 )n -1, U n ) 
+ (Pl,n, Un-l) + (Pz,rn Un+l)i 



where 



p j,n = -Pj(u„_i,u n ,u n+ i), j = 1,2,3. 



From this and (15) it follows that the energy flux through the 
n-th cross section has the following simple form: 



.1, 



(Pi 



(16) 



Let us note that taking into account long-range interactions 
would complicate this formula considerably, making the cal- 
culations virtually intractable. This is why the approximation 
we have made is critical. 

For a direct numerical modeling of the heat transfer along 
the double helix, we consider a finite structure of the length 
NAz with fixed ends. We assume that the first N. 



+ 



20 

segments are placed in the thermostat at temperature T+ = 
310 K and the last iV_ = 20 segments are placed in the other 
thermostat at Tl = 290 K. The helix dynamics is described 
by the following equations of motion: 



Mii ri 
MiL 



-F n -rMu n + S+, n = l,...,N+, 
-F n , n = N+ + l,...,N-N_, 



(17) 



rMu„ + s: 



N-N- + 1,...,N, 



where F„ = dH/du n , T = l/t r is the damping coefficient 
(relaxation time t r = 1 ps, and E„ = ...,Cfe) is a 36- 
dimensional vector of normally distributed random forces nor- 
malized by the condition 



(£n,i(*0£j(fa)) = 2Mk B T±6 nm 8 ij 6(t 2 - h), 

where the mass M = M k , if i = 3(k - 1) + I, k = 1, 12, 
I = 1,2,3. 

We take the initial conditions (3) corresponding to the equi- 
librium state of the helix. With these initial conditions, we in- 
tegrate the equations of motion (17) numerically, by employ- 
ing the velocity Verlet method with step At = 0.0005 ps. Af- 
ter integration time to [this value depends on the helix length 
between the thermostats, AL = (N — N + — N-)Az], we ob- 
serve the formation of a temperature gradient and a constant 
heat energy flux in the central part of the helix. It is impor- 
tant to notice that the time to can be reduced by modifying 
the initial distribution of the energy, e.g., by taking the initial 
condition for the system (17) as homogeneously thermalized 
state with the mean temperature T = (T + + T_)/2 = 300 K. 

After the stationary heat flux is established, the temperature 
distribution can be found using the formula 



T n = lim — — — 
t-»oo ookfit 



(Mu„(r),u„(r))dr 




310 



300- 



290 




FIG. 8: Distributions of (a) local heat flux J n and (b) local tempera- 
ture T„ in the double helix with length NAz. The input parameters 
are N = 60, temperatures T+ = 310 K and T_ = 290 K, and the 
number of cells in the thermostats, N± = 20. 



and the averaged value of the energy flux along the helix 

Az f f 
J n = lim — / j n {r)dT. 

Distributions of the local energy flux and temperature along 
the helix are shown in Figs. 8 (a) and (b). In the steady-state 
regime, the heat flux through each of the cross section at the 
central part of the helix should remain the same, i.e. J n = J, 
N + < n < N — N-. This property can be employed as a 
criterion for the accuracy of numerical modeling and can also 
be used to determine the characteristic time for achieving the 
steady-state regime and calculation of J n and T n . Figure 8 (a) 
suggests that the flux is constant along the central part of the 
helix indicating that we have reached the required regime. 

At the central part of the helix, we observe a linear gradi- 
ent of the temperature distribution, so that we can define the 
coefficient of thermal conductivity as 



k(N -7V+ -N-) 



(N-N--N+- 1)J 



(Tn + +i — Tj 



N—N. 



)S 



(18) 



where S = ttR 2 is the area of the cross section of the double 
helix (R = 8 A is the radius of helix on phosphorus atoms). 
In this way, the calculation of thermal conductivity is reduced 
to the calculation of the limiting value, 

K = lim k(N). 

In order to determine the coefficient of thermal conductiv- 
ity, we need to know only the dependence of the temperature 
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from base-pair number in the central part of the helix. How- 
ever, a change of the temperature distribution at the edges of 
the helix can also provide some useful information. If the he- 
lix is placed into a Langevin thermostat at temperature T, each 
segment of the helix should have the temperature T n = T due 
to the energy balance of the input energy from random forces 
and the energy lost to dissipation. Then, an averaged energy 
flow from the n-th segment of the helix can be presented as 

r(Mu„,u„) = 36k B T n /t r . 

If only the edges of the helix are placed into thermostat, there 
appears an additional energy exchange with its central part, so 
the energy from the right edge will flow to the left one. As 
a result, the temperature of the left edge is reduced (T„ < 
T + , n = 1,2,..., N + ), whereas the temperature at the right 
edge increases (T n > T_, n = N - N- + 1, N) - see 
Fig. 8 (b). This information allows us to find the energy flux 
in the central part of the double helix using only the energy 
imbalance at the edges, 



Jt r 



N + 



N 



(T+ - T n ) 



(r n -r_). (19) 



Az36kn 

a n=l n=N-N-+l 

If the lengths of the edges placed into thermostat coincide, 
i.e., N + = iV_ = N±, we can rewrite this formula in the 
following simplified form: 



N± 

j = Azl8kB J2(T + 

n— 1 



T_-T n + T N+1 _ n ). (20) 



Equation (19) gives an alternative way to calculate ther- 
mal energy flux J; the equation can be employed to verify 
results obtained via Eq. (16). Let us note that although (16) is 
obtained under the assumption of no long-range interactions, 
formula (20) remains valid also if these interactions are taken 
into account. 

Numerical modeling of the heat transfer shows that both 
formulas lead to the same value of the heat-conductivity co- 
efficient if long-range interactions are absent. When N=80 
(the number of internal links Ni = N — N + - iV_ = 20), 
the heat-conductivity coefficient k = 0.26 W/mK. When 
N = 80 (Ni = 40) - conductivity k = 0.29 W/mK, when 
N = 120 (Ni = 80) - « = 0.27 W/mK, and when N = 200 
(N = 160) - K = 0.28 W/mK. The same values are obtained 
also if the long-range interactions are taken into account (and 
the heat flow is calculated by formula (20) only). These con- 
siderations help us reach the conclusion that the contribution 
of the long-range interactions to the heat transfer along the 
double helix is very minor. 

It is worth noting that the use of formula (20) for calculat- 
ing the value of heat transfer requires more time-consuming 
calculations. Therefore, it is preferable to use formula (16). 
Also, equation (16) allows one to estimate relative contribu- 
tions of various interactions into the process of heat transfer. 
We find that interaction between neighboring base pairs con- 
tributes 32% to the net energy flow, with the rest of the heat 
transfer occurring along the two sugar-phosphate chains. 



As one can see from the results, the value of heat con- 
ductivity k in the DNA macromolecule does not depend on 
the length of the molecule. This is normal thermal conduc- 
tivity for which Fourier's law is valid at nano-level as well, 
at least as far as the DNA is concerned. This is in contrast 
to earlier models of heat conduction along carbon nanotubes 
and nanoribbons that predicted anomalous thermal conduc- 
tivity - divergence of the coefficient of thermal conductiv- 
ity with sample length[10-13]. Compared to nanotubes, the 
DNA double helix is much softer, which leads to strongly 
nonlinear behavior at T = 300 K (in contrast, a nanotube is 
a rigid quasi-one-dimensional structure, with only weak non- 
linear dynamics). Contribution of nonlinearity to the DNA 
dynamics will be explored in more detail in the following sec- 
tion. 



VIII. DEPENDENCE OF THE THERMAL 
CONDUCTIVITY ON TEMPERATURE 

At T = 300 K the DNA double helix exhibits high- 
amplitude vibrations (the amplitudes can be estimated from 
Fig. 4 and 5). The contribution of nonlinearity to the DNA 
dynamics can be estimated from the temperature dependence 
of dimensionless heat capacity 



c(T) 



1 



36Nk B T dT 



E(T), 



(21) 



where E(T) = (H) is average double helix energy at temper- 
ature T. For a harmonic system, dimensionless heat capacity 
c(T) = 1; for a system with strong anharmonism c(T) < 1, 
and c(T) > 1 for weakly anharmonic systems. As seen 
from Fig. 9, heat capacity of the double helix equals to 1 for 
low temperatures (T < 10 K) and increases monotonously 
when the temperature grows. The heat capacity c = 1.05 at 
T = 300 K, implying weak anharmonism. 

The role of nonlinearity decreases monotonously as the 
temperature decreases. In the limiting case T^O the double 
helix becomes harmonic. Therefore, classical thermal con- 
ductivity has to increase monotonously as the temperature de- 
creases, and diverge when T 0. The results of our numer- 
ical modeling confirm this conclusion - see Fig. 9 (b), curve 
3. At T \ the heat conductivity k/oo. 

We should mention that the temperature dependence of 
the DNA thermal conductivity found above is obtained with 
the framework of classical molecular-dynamics model, which 
does not take into account quantum effects of "frozen" high- 
frequency oscillations (to take those into account requires sub- 
stantial modifications to the model[92, 93]). In crystals at 
low temperatures, thermal conductivity decays monotonically 
when T — > 0. This is explained by the fact that at low temper- 
atures the temperature dependence of thermal conductivity is 
defined mainly by the temperature dependence of heat capac- 
ity. 

In classical mechanics, heat capacity of phonons does 
not depend on temperature, whereas in quantum mechan- 
ics such a dependence is defined by the formula c(uj,T) = 
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T(K) 

FIG. 9: (a) Temperature dependence of dimensionless specific heat 
c(T) and c q (T) (curves 1 and 2, respectively); b) heat conductivity 
k(T) and K q (T) (curves 3 and 4, respectively) of the DNA double 
helix. The dependencies c(T) and k(T) are obtained in the frame- 
work of classical molecular dynamics model, while c q (T) and K q (T) 
are computed within the quantum framework. 

^B-pEtw, T), where the Einstein function 

TP ( T\=( — V cxP^/fcgJ 1 ) 
E[UJ ' ' \k B T) [fflq>(fej/fcBT)-l] 2 ' 

where uj is the phonon frequency (0 < Fe < 1, function 
F E \ for T \ and Fe / 1 for T / oo). 

As seen from the DNA dispersion curves {(^(g)}?^, 
the main contribution in the heat conductivity is determined 
by the 20 low-frequencies phonons (16 high-frequencies 
phonons have very small group velocities, and therefore can 
not be efficient energy carriers). The temperature dependence 
of dimensionless heat capacity of low frequencies phonons 
can be found using formula 

c 9 (r) = — ]T / F E (uH(q),T)dq. (22) 

207T ^ J 

One can see from Fig. 9 that the heat capacity c q does not 
noticeably depend on temperature if T > 150 K, and tends 
monotonously to zero as the temperatures decrease below 
T < 150 K. 

Thus, thermal vibrations of the double helix can be de- 
scribed classically for T > 150 K only. For lower tem- 
peratures, quantum effect caused by "freezing out" of high- 
frequency vibrations must be taken into account. Due to these 



effects the DNA heat capacity (22) tends monotonously to 
zero as the temperature decreases. The double helix ther- 
mal conductivity n q (T) w c 9 (T)k(T), (where the tempera- 
ture dependence k(T) is calculated classically) because the 
phonon energy is proportional to heat capacity. As it seen 
form Fig. 9 (b) at T > 30 K the thermal conductivity K q 
grows monotonously as the temperature decreases, reaching 
its maximum at T m 30 K, and then decreases monotonously 
as T -> 0. 

These calculations show that heat transfer in the DNA oc- 
curs mainly due to propagation of low-frequency phonons 
(frequencies oj < 175 cm -1 ), i.e., by "soft" low-frequencies 
waves. Such oscillations are strongly coupled to deformation 
of orientation angles. This fact clearly distinguishes the DNA 
double helix from the essentially rigid carbon nanotubes and 
nanoribbons. The simplest model of a one-dimensional sys- 
tem with orientational interaction is one-dimensional chain of 
interacting rotators. This chain has a finite thermal conductiv- 
ity [94, 95]. On the other hand, nanotubes and nanoribbons 
are commonly described in the one-dimensional approxima- 
tion as anharmonic Fermi Pasta Ulam (FPU) chains that lead 
to infinite heat conductivity [17, 18]. 

Thus, the double helix of a homogeneous poly-G DNA has 
a finite thermal conductivity k = 0.3 W/mK. The double he- 
lix with a nonhomogeneous (arbitrary) base sequence may be 
expected to have a smaller value of the heat conductivity co- 
efficient since the presence of inhomogeneities leads to addi- 
tional phonon scattering. Therefore, thermal conductivity of a 
generic DNA double helix, k < 0.3 W/mK, may be expected 
to be less than half of that of water heat conductivity which 
is 0.6 W/mK. This means that DNA macromolecule is a ther- 
mal insulator relative to its surrounding solution. It should be 
noted that experimentally measured thermal conductivity of 
the DNA-gold composite structure (DNA is a matrix for gold 
nano-particles) [16] gives the coefficient of thermal conduc- 
tivity 150 W/mK, which is 500 times higher than the predicted 
thermal conductivity of pure DNA. Thus, we conclude that the 
measured thermal conductivity of the DNA-gold composite is 
completely determined by the metal component, not the DNA. 



IX. CONCLUSIONS 

A coarse-grain (12CG) model of DNA double helix is pro- 
posed in which each nucleotide is represented by 6 "grains". 
The corresponding effective pair potentials are inferred from 
correlation functions obtained from classical all-atom molec- 
ular dynamics (MD) trajectories and potentials (AMBER). 
The computed structural characteristics and fluctuations of the 
double helix at T = 300 K are in reasonable agreement with 
available experimental data and earlier computations based on 
all-atom models. An analysis of dispersion curves derived 
from the coarse-grained model yields longitudinal and tor- 
sional sound velocities in close agreement with experiment. 

The numerical modeling of heat conductivity along a sin- 
gle DNA molecule shows that double DNA helix has a finite 
(normal) thermal conductivity. This means that Fourier's law 
is valid at nano-level for the DNA, i.e., coefficient of thermal 



11 



TABLE I: Masses of the three coarse grains (mi, 7712, 7713) for each 
of the base X=A, T, G, C. In units of proton mass m p 



X 


mi 


m 2 


m 3 


A 


52.230 


28.139 


53.632 


T 


51.822 


16.204 


56.974 


G 


61.731 


34.357 


53.912 


C 


39.254 


35.492 


35.254 



conductivity does not depend on the length of the DNA frag- 
ment. Single molecule DNA thermal conductivity does not 
exceed 0.3 W/mK, which is two times smaller than thermal 
conductivity of water. Thus, DNA double helix is a poor heat 
conductor. At the same time, it is known from modeling of 
heat transfer along carbon nanotubes and nanoribbons that the 
coefficient of thermal conductivity in these systems diverges 
as the specimen length grows [10-13]. The anomalous be- 
havior of thermal conductivity in long nano-objects is caused 
by their rigid structure as well as by their weakly nonlinear 
quasi one-dimensional dynamics, mostly due to rigid cova- 
lent interactions. In contrast, the DNA double-helix is a soft 
3D structure with strongly nonlinear dynamics. Based on the 
results of our coarse-grained simulations we conjecture that 
heat conduction along the double helix is due predominantly 
to weak non-valent orientational interactions. 
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Appendix A: Masses of the coarse grains. 

The mass of each of the backbone grains [P],[C3] and [CI] 
is calculated as a sum of the masses of the original atoms in- 
cluded in the grain, Fig. 2. So mrpi = 109 a.e., mra = 26 
a.e., m\ci] = 43 a.e. The distribution of the total mass of 
base X (X = A, T, G, C) between its three defining grains, 
mi, r7i2, m-3, can be found from the condition of preserving 
the total mass and preserving the position of the center of mass 
of the base. Values of the grain masses are shown in table I. 

Appendix B: The potential function. 

For convenience let's re-write the Hamiltonian of the sys- 
tem: 

H = Ek+E v +Eb+E a +Et+Ehb+E s t+E e i+E v dw ■ (Bl) 
The first term is the kinetic energy of the system: 

12N 

E k = Yl 2 Mi ^ ' (B2) 

n=l 



where the summation is over all 12 N coarse-grain particles 
(grains) in the system. 




FIG. 10: Grains involved in valent interactions. Blue lines denote 
valent (harmonic) bonds, red arcs mark valent angles, bold blue lines 
are axes of rotation in the torsion potentials. The circles marked as 
N stand for original atoms N9 on A,G bases and Nl on T,C bases 
(no coarse grains are centered on these atoms, their coordinates are 
calculated from the positions of the three grains that define the base 
plane). 

The second term E v in the Hamiltonian (Bl) stands for de- 
formation energy of "valence" (pair) bonds. Pair potentials 
have the standard form 

C/ a/3 (xi,x 2 ) = ^K al3 (\x 2 - xi | - Rap) 2 , (B3) 

where a/3 denotes types of bonded particles (for example, P 
and C3), parameter R a p is the equilibrium length, parameter 
K a p is the bond stiffness. Values of these parameters were 
obtained by analysis of all-atomic MD trajectories. These po- 
tentials are calculated for the following pairs: P and C3, C3 
and CI, C3 and P, P and CI, CI and P, P and P (from neigh- 
bouring sites). Order in a pair corresponds to direction from 
3'-end to 5'-end (see Fig. 10). The parameter values are given 
in the table II. 

The third term Eb in the Hamiltonian (Bl) describes base 
deformation energy. This term was introduced to keep all four 
points near one plane and serves to mimic valent interaction 
in nitrogen bases. Let's denote the position of CI particle by 
xi and positions of the three particles on a base by x 2 , X3 , X4. 
The deformation energy includes harmonic constraints on pair 
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TABLE II: Values of the stiffness coefficients K a p and bond lengths 
R a /3 for pair interaction potentials U a p{x.i, X2). 



a/3 


PC 3 C3C1 


C3P 


PCI CIP 


PP 


K aP (eV/A 2 ) 


9.11 8.33 


0.694 


0.66 0.781 


0.20 


R a p (A) 


2.6092 2.3657 


4.0735 


3.6745 4.8938 


6.4612 



distances and a constraint on the bending angle of the rectan- 
gle { Xi X2, X3, X4 } around its diagonal. Thus, base 7 (7=A, 
T, G, C) deformation energy is given by the following for- 
mula: 

t/ 7 (xi,x 2 ,X3,x 4 ) = -K n [(|xi - x 2 | - i? 7 i 2 ) 2 
+(|xi - x 4 | - R. fU f + (|x 2 - x 3 | - i? 723 ) 2 

-r-(|x 2 - X 4 | - i? 7 24) 2 +(|X3 - X4| - i? 7 34) 2 ] 

+e 7 (l + cos6»), (B4) 

where 9 is the angle between the two planes X1X2X4 and 
X2X3X4 (equilibrium corresponds to all four points lying on 
one plane and 9 = it). The values of potential parameters 
can be found in table III. Parameters Ryi^,...,RjS4 were de- 
fined as equilibrium distances between corresponding points 
on bases, values of parameters A' 7 and e 7 were determined 
from analysis of frequency spectrum of base oscillations in all 
atomic DNA molecular dynamics [19]. 

The fourth term E a in the Hamiltonian Bl describes the 
energy of angle deformation and has following form: 

U a {9) =e a (cos9~cos9 a ) 2 , 

This energy is calculated for following angles: C3-P-C3, 
C3-C1-N, N-Cl-P. Here N denotes a specific nitrogen atom 
atom on the base: atom N9 for bases A and G, and atom Nl 
for bases T and C. Equilibrium angle and deformation energy 
are summarized in the table IV. 

The fifth term E t in the Hamiltonian (Bl) describes tor- 
sional deformation energy. It has the form: 

U t = e t (l - cos(0 - O )) 

The first type of potential is for the torsion C3-C1-N9-C8 
(C3-C1-N1-C6) - i.e., rotations of base A, G (T, C) around 
the bond CI — N9 (Cl-Nl). The second type of potential is 



TABLE III: Values of parameters for potential Ux describing defor- 
mation of the base X=A, T, G, C. 



7 


A 


T 


G 


C 


Ryl2 (A) 


2.6326 


5.0291 


2.5932 


2.4826 


RylA (A) 


4.3195 


2.7007 


5.2651 


2.6896 


i? 7 23 (A) 


4.2794 


2.8651 


4.2912 


3.5882 


-R 7 24 (A) 


4.3111 


5.5150 


5.6654 


3.5014 


i? 7 34 (A) 


3.5187 


4.5399 


4.5807 


4.5523 


K-y (eV/A 2 ) 


30 


30 


30 


20 


e 7 (eV) 


100 


100 


150 


70 



TABLE IV: Values of deformation energy ex and equilibrium angle 
9x for angle potentials. 

type C3-P-C3 C3-C1-N N-Cl-P 
e a (eV) 0.5 3. 0.3 

6 a 130.15° 141.63° 87.17° 



TABLE V: Deformation energy et and equilibrium values 4>a for the 
torsional potentials. 

Potential C3-C1-N-C C3-P-C3-C1 C1-C3-P-C3 
e t (eV) 0.5 0.5 0.5 

4>o -26.21° 48.58° 



for the torsion C3-P-C3-C1, the third one for the torsion Cl- 
C3-P-C3. Parameters of these potentials are summarized in 
table V. 

The sixth term Ehb in the Hamiltonian (Bl) describes the 
energy of interaction between complementary bases. Since 
each nitrogen base is a rigid planar structure, one can restore 
positions of all of its original atoms from positions of the three 
coarse-grain atoms, as outlined in the previous section. Let's 
denote the set of coordinates of three coarse-grain atoms by 
X n with n being a number of the base-pair. One can cal- 
culate coordinates of all of the original atoms on the base: 
ri(X n ), r2(X n ), .... Hence we can use the proven all-atom 
AMBER (van der Waals and electrostatics) potentials [19] for 
hydrogen bonds and stacking interactions. Thus 

E hb = J2 V XY{X n ,Y n ) = 



U A MBER{ri{X n ) 1 r 2 {X n ), . . . ,r 1 (Y n ),r 2 (Y n ), . . .). 

n 

where Vxy{X n , Y n ) is a potential of interaction between X 
(X=A,T,G,C) base and complementary Y (Y=A,T,G,C) base. 

The main part of the hydrogen bond energy is interactions 
between atoms near the hydrogen bond - see Fig. 1 1 (a) and 
(b). Hence the number of interacting atoms can be reduced. 
Let's denote this "reduced" potential by V XY {X n , Y n ). Then 

Ehb = VxY(X n , Y n ). 

n 

The interaction energy between neighboring bases is given 

by 

E s t = ^ VxY(X n , X n+ i) + V X y(X n , 

n 

VxY(Y n ,Y n+ i) + V XY (Y n ,X n+ i). 

Atoms whose interactions are taken into account in calcu- 
lation of the interaction energy between neighbor bases are 
shown in Fig. 11 (c). 
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FIG. 1 1 : View of (a) AT base pair, (b) GC base pair (highlighted are 
atoms which contribute most to base-base interaction energy) and (c) 
two neighboring base-pairs (AT and GC). Arrows indicate parts of 
nitrogen bases whose interaction is taken into account: for bases on 
the complementary strands only those atoms that face another con- 
tribute to the interaction, while for neighboring bases on the same 
strand all of the atoms contribute. 



The eighth term E e i of the Hamiltonian (Bl) describes the 
charge-charge interactions within the double helix. Within 
our model, only the phosphate groups interact via long-range 
electrostatic forces. We assume that each [P] grain carries 
charge equal to the electron charge qp = — le, while all 
other particles are neutral. The total electrostatic energy of 
the DNA in aqueous environment (including ions) is written 
as E e i = E vac + AG so i v , where E vac represents the Coulomb 
interaction energy in vacuum, and AG so iv is defined as the 
free energy of transferring the molecule from vacuum into sol- 
vent, i.e., solvation free energy. The above decomposition is 
an approximation made by most classical (non-polarizable) 
potential. Within our model we further assume that AG so i v 
contains only the electrostatic part; this is a reasonable as- 
sumption as long as the shape of the DNA double-helix does 



not change drastically during dynamics (e.g., the strands do 
not separate), and thus changes in the "hydrophobic" part of 
AG so i v can be neglected. While computation of the Coulomb 
part of the interaction is trivial, estimation of AG so i v is not, 
due to non-trivial shape of the biomolecule. Within the frame- 
work of the continuum dielectric, linear response theory the 
principle way of estimating AG so i v is by solving the Poisson- 
Boltzmann (PB) equation with the boundary conditions deter- 
mined by the molecular surface that separates the high dielec- 
tric solvent from the low dielectric interior of the molecule. 
However, the corresponding procedures are expensive, and 
currently of limited practical use in dynamical simulations. 
We therefore resort to the so-called generalized Born model 
[65-67] (GB), which is the most widely used alternative to the 
PB treatment when speed of computation is a concern, partic- 
ularly in molecular dynamics [36], including simulations of 
nucleic acids [41, 68-77]. 

The GB model approximates AG so i v by the following for- 
mula proposed by Still et al. [65] 



AG 



solv 



4< 



tout J 4^ 



(B5) 



where e out is the dielectric constant of water, is the dis- 
tance between atoms i and j, qi is the partial charge of atom 
i, Ri is the so-called effective Born radius of atom i, and 



/ 



RiRjexpi-rl/lRiRj) 



The empirical func- 



tion is designed to interpolate between the limits of large 
r,ij ^ yjRiRj where the Coulomb law applies, and the oppo- 
site limit where the two atomic spheres fuse into one, restoring 
the famous Born formula for solvation energy of a single ion. 
The effective Born radius of an atom represents its degree of 
burial within the low dielectric interior of the molecule: the 
further away is the atom from the solvent, the larger is its ef- 
fective radius. In our model, we assume constant effective 
Born radii which we calculate once from the first principles 
[78]. The screening effects of monovalent salt are introduced 
approximately, at the Debye-Huckel level by substitution 

1 - touC 1 -> 1 - touC 1 exp(-0.73«/). 

The 0.73 pre-factor was found empirically to give the 
best agreement with the numerical PB treatment [79]. 
Here k is the Debye-Huckel screening parameter k[A _1 ]« 
0.316-y/jsaItj [mol/L]. 

Further simplifications come from the fact that we have 
only one non-zero charge species in our model, the [P] grain. 
Then, the total electrostatics energy is given by 

N P 

Eel =C +J2 V 1^) 

where the summation is performed over all different [P]- 
grains pairs where 



V q {r) = & 



1 1 



r f(r) 



(l - e^e" - 73 ^) 



(B6) 
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describes self-energy (solvation energy) of phosphate groups. 

The resulting total electrostatic potential due to a single [P] 
particle as a function of distance is shown in Fig. 12. One 
can see that for small distances r < 80A potential decreases 
with increasing distance r as r~ 3 . For long distances the fall- 
off is exponential. Thus we can introduce a cut-off distance 
Rq = 100A for the electrostatics interactions. For r > Rq 
interaction between particles is set to zero: V q = 0. 

The last term E v dw m the Hamiltonian (Bl) describes van 
der Waals interaction between different side chain [P] and 
[C3] grains. The potential depends on the distance r between 
two grains and is given by 



o- 



5 10 o 15 20 

r(A) 

FIG. 12: Electrostatics potential V q (r), equation (B6). 



Here r denotes the distance between coarse-grain [P] parti- 
cles, Ri = Rj = Rp = 2.104 A is the effective Born radius 
of phosphate particle. The coefficient C\ — 14.40061lAeV, 
£out = 78, k = 0.1 what corresponds to physiological condi- 
tions. Parameter 



Uij (r) = etj — V - 1 - cy , i, j = P, C3, 



where ey = y/etej, dij = di + dj, <Jij = tr, + <7j, energy 
parameters are ep = O.OleV, ec3 = 0.005eV, diameters are 
dp = 2.4A, dc3 = 2A, parameter a P = 1.6k, aa = 1.9A. 

In practical applications of the 12CG model one should 
keep in mind that the model was designed to describe only 
the double helical form of DNA, so it may not be appropri- 
ate to situation when melting or base openings are expected. 
This limitation is the price one pays for computational effi- 
ciency: within our model van der Waals interactions are cal- 
culated only for backbone grains that belong to separate DNA 
strands, and only nearest neighbor base pairs interact. 
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